Decision Tree Learning


Tree in Tree: from Decision Trees to Decision Graphs

Neural Information Processing Systems

Decision trees have been widely used as classifiers in many machine learning applications thanks to their lightweight and interpretable decision process. This paper introduces Tree in Tree decision graph (TnT), a framework that extends the conventional decision tree to a more generic and powerful directed acyclic graph.
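As a rough illustration of the idea (a minimal sketch of our own, not the authors' TnT implementation), letting internal nodes share children is what turns a decision tree into a directed acyclic decision graph:

    # Minimal sketch of a decision-graph classifier (hypothetical, not the
    # authors' TnT code): internal nodes route left/right on a feature
    # threshold, and several parents may share the same child node, which
    # is what makes the structure a DAG rather than a tree.

    class Node:
        def __init__(self, feature=None, threshold=None, left=None, right=None, label=None):
            self.feature = feature      # index of the feature to test (internal nodes)
            self.threshold = threshold  # split threshold (internal nodes)
            self.left = left            # child if x[feature] <= threshold
            self.right = right          # child if x[feature] > threshold
            self.label = label          # class label (leaf nodes)

    def predict(node, x):
        # Follow the unique routing path from the root; acyclicity guarantees termination.
        while node.label is None:
            node = node.left if x[node.feature] <= node.threshold else node.right
        return node.label

    # A shared child: two different parents route some inputs to `common`.
    common = Node(label=1)
    root = Node(feature=0, threshold=0.5,
                left=Node(feature=1, threshold=0.3, left=Node(label=0), right=common),
                right=common)

    print(predict(root, [0.2, 0.9]))  # -> 1 (reaches the shared leaf)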




439d8c975f26e5005dcdbf41b0d84161-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers for their thoughtful and valuable feedback. Response to Reviewer 1: We begin by responding to Reviewer #1's remarks concerning the notion of estimating learnability, namely that "The results are not surprising at all. That does not mean that it's easy to prove, but it is still not surprising that it is" and that "the paper makes strong monotonicity assumptions, but does not discuss the implications of it on the strength" of the results; we thank the reviewer for raising these points. Response to Reviewer 2: We thank Reviewer #2 for suggestions for improving our presentation, and we thank the reviewer for their question about overall complexity.


Appendix

Neural Information Processing Systems

Organization. The supplementary material is organized as follows: Section A presents a brief review of the concepts concerning first-order logic that are used in our work. Section C presents a proof of Theorem 1, our negative result concerning decision trees and OBDDs, while Section D is devoted to our positive result. Next, Section F presents a proof of Theorem 3, implying the full tractability of FOIL for a restricted class of OBDDs. Section G discusses details of the practical implementation, while Section H explains the methodology of our experiments. Then, Section I discusses details of the high-level version we implemented, and also presents several examples of queries for the Student Performance Data Set which serve to show the usability of our implementation in practice. Finally, Section J explains the binarization process for real-valued decision trees and high-level queries.

We review the definition of first-order logic (FO) over vocabularies consisting only of relations. We assume the existence of a countably infinite set of variables {x, y, z, ...}, possibly with subscripts. The set of FO-formulas over a vocabulary σ is inductively defined as follows:
1. If x, y are variables, then x = y is an FO-formula over σ.
2. If a relation symbol R ∈ σ has arity n > 0 and x₁, ..., xₙ are variables, then R(x₁, ..., xₙ) is an FO-formula over σ.
3. If ϕ, ψ are FO-formulas over σ, then (¬ϕ), (ϕ ∧ ψ), and (ϕ ∨ ψ) are FO-formulas over σ.
4. If x is a variable and ϕ is an FO-formula over σ, then (∃x ϕ) and (∀x ϕ) are FO-formulas over σ.
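For concreteness, the inductive definition above can be mirrored as a small abstract syntax tree; the sketch below is our own illustration, not code from the paper:

    # Illustrative sketch (not from the paper): the inductive definition of
    # FO-formulas over a relational vocabulary, mirrored as Python dataclasses.
    from dataclasses import dataclass

    @dataclass
    class Eq:            # x = y
        x: str
        y: str

    @dataclass
    class Atom:          # R(x1, ..., xn) for a relation symbol R in sigma
        relation: str
        variables: tuple

    @dataclass
    class Not:           # (¬φ)
        phi: object

    @dataclass
    class And:           # (φ ∧ ψ)
        phi: object
        psi: object

    @dataclass
    class Or:            # (φ ∨ ψ)
        phi: object
        psi: object

    @dataclass
    class Exists:        # (∃x φ)
        x: str
        phi: object

    @dataclass
    class Forall:        # (∀x φ)
        x: str
        phi: object

    # Example formula: ∃x (R(x, y) ∧ ¬(x = y))
    example = Exists("x", And(Atom("R", ("x", "y")), Not(Eq("x", "y"))))
    print(example)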



Generative Forests

Neural Information Processing Systems

We focus on generative AI for a type of data that still represents one of the most prevalent forms of data: tabular data. Our paper introduces two key contributions: a new, powerful class of forest-based models fit for such tasks, and a simple training algorithm with strong convergence guarantees in a boosting model that parallels the original weak/strong supervised learning setting. This algorithm can be implemented with a few tweaks to the most popular induction scheme for decision trees.
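To make the flavor of forest-based generation concrete, here is a deliberately simplified sketch of our own (not the paper's algorithm): partition tabular data with an axis-aligned tree, then generate new rows by sampling a leaf in proportion to its training mass and drawing uniformly inside its empirical bounding box.

    # Toy sketch of tree-based generative sampling for tabular data (our own
    # simplification, not the paper's boosting algorithm).
    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 2))                       # toy tabular data

    # Obtain an axis-aligned partition by regressing X on itself column-wise.
    tree = DecisionTreeRegressor(max_leaf_nodes=8).fit(X, X)
    leaf_of = tree.apply(X)                             # leaf id per training row

    def sample(n):
        leaves, counts = np.unique(leaf_of, return_counts=True)
        out = []
        for leaf in rng.choice(leaves, size=n, p=counts / counts.sum()):
            cell = X[leaf_of == leaf]                   # rows falling in this leaf
            lo, hi = cell.min(axis=0), cell.max(axis=0) # its empirical bounding box
            out.append(rng.uniform(lo, hi))             # uniform draw inside the box
        return np.array(out)

    print(sample(3))                                    # three synthetic rows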


Debiased Causal Tree: Heterogeneous Treatment Effects Estimation with Unmeasured Confounding

Neural Information Processing Systems

Unmeasured confounding poses a significant threat to the validity of causal inference. Although various ad hoc methods have been developed to remove confounding effects, they rely on fairly strong assumptions. In this work, we consider the estimation of conditional causal effects in the presence of unmeasured confounding using observational data and historical controls. Under an interpretable transportability condition, we prove the partial identifiability of the conditional average treatment effect on the treated group (CATT). For tree-based models, a new notion, confounding entropy, is proposed to measure the discrepancy introduced by unobserved confounders between the conditional outcome distributions of the treated and control groups. The confounding entropy generalizes conventional confounding bias, and can be estimated effectively using historical controls. We develop a new method, debiased causal tree, whose splitting rule is to minimize the empirical risk regularized by the confounding entropy.
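As a rough reading of that splitting rule (a hypothetical sketch, not the authors' method), one can score a candidate split by within-node empirical risk plus a penalty for the treated-versus-control outcome discrepancy; the `outcome_divergence` helper below is our own stand-in that uses a simple histogram KL divergence in place of the paper's confounding entropy.

    # Hypothetical sketch of a regularized splitting criterion (our own
    # reading of the abstract, not the authors' code). Assumes both treatment
    # groups are nonempty, with non-degenerate outcomes, on each side of the split.
    import numpy as np

    def outcome_divergence(y_treated, y_control, bins=10):
        # Crude stand-in for confounding entropy: KL divergence between binned
        # outcome histograms of the treated group and the control group (in the
        # paper, the control distribution would draw on historical controls).
        lo = min(y_treated.min(), y_control.min())
        hi = max(y_treated.max(), y_control.max())
        p, _ = np.histogram(y_treated, bins=bins, range=(lo, hi), density=True)
        q, _ = np.histogram(y_control, bins=bins, range=(lo, hi), density=True)
        p, q = p + 1e-8, q + 1e-8                   # smoothing to avoid log(0)
        p, q = p / p.sum(), q / q.sum()
        return float(np.sum(p * np.log(p / q)))

    def split_score(y, treated, mask, lam=1.0):
        # Empirical risk (within-node outcome variance) on each side of the
        # split, regularized by the treated-vs-control outcome divergence.
        score = 0.0
        for side in (mask, ~mask):
            y_s, t_s = y[side], treated[side]
            risk = y_s.var()
            penalty = outcome_divergence(y_s[t_s], y_s[~t_s])
            score += risk + lam * penalty
        return score                                 # pick the split minimizing this

    y = np.random.default_rng(0).normal(size=200)
    treated = np.random.default_rng(1).random(200) < 0.5
    print(split_score(y, treated, np.arange(200) < 100))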


No-Regret Bandit Exploration based on Soft Tree Ensemble Model

Neural Information Processing Systems

We propose a novel stochastic bandit algorithm that estimates rewards using a tree ensemble model. Specifically, our focus is on a soft tree model, a variant of the conventional decision tree that has undergone both practical and theoretical scrutiny in recent years. By deriving several non-trivial properties of soft trees, we extend the existing analytical techniques used for neural bandit algorithms to our soft tree-based algorithm. We demonstrate that our algorithm achieves smaller cumulative regret than existing ReLU-based neural bandit algorithms. We also show that this advantage comes with a trade-off: the hypothesis space of the soft tree ensemble model is more constrained than that of a ReLU-based neural network.
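To illustrate what a soft tree is (our own minimal sketch, not the authors' model), the example below replaces hard threshold splits with sigmoid gates, so every input reaches each leaf with some probability and the prediction is a probability-weighted average of leaf values:

    # Minimal soft (differentiable) decision tree of depth 2 (our own
    # illustration): each internal node applies a sigmoid gate instead of a
    # hard threshold test.
    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def soft_tree_predict(x, W, b, leaf_values):
        # W, b: parameters of the 3 internal nodes (root, left, right);
        # leaf_values: the 4 leaves, ordered left-to-right.
        g = sigmoid(W @ x + b)          # per-node probability of routing "right"
        p_leaf = np.array([
            (1 - g[0]) * (1 - g[1]),    # path: left, left
            (1 - g[0]) * g[1],          # path: left, right
            g[0] * (1 - g[2]),          # path: right, left
            g[0] * g[2],                # path: right, right
        ])
        return float(p_leaf @ leaf_values)

    rng = np.random.default_rng(0)
    x = rng.normal(size=5)
    W, b = rng.normal(size=(3, 5)), rng.normal(size=3)
    print(soft_tree_predict(x, W, b, np.array([0.0, 1.0, 2.0, 3.0])))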